AITopics | vector embedding

vector embedding

Fast Machine Learning Method with Vector Embedding on Orthonormal Basis and Spectral Transform

Lu, Louis Yu

arXiv.org Artificial Intelligence, Nov-13-2023

This paper presents a novel fast machine learning method that leverages two techniques: Vector Embedding on Orthonormal Basis (VEOB) and Spectral Transform (ST). VEOB converts the original data encoding into a vector embedding whose coordinates are projections onto an orthonormal basis. Singular Value Decomposition (SVD) is used to calculate the basis vectors and projection coordinates, which leads to an enhanced distance measurement in the embedding space and facilitates data compression by preserving only the projection vectors associated with the largest singular values. ST, in turn, transforms a sequence of vector data into spectral space. By applying the Discrete Cosine Transform (DCT) and selecting the most significant components, it streamlines the handling of lengthy vector sequences. The paper provides examples of word embedding, text chunk embedding, and image embedding, implemented in the Julia language with a vector database. It also investigates unsupervised and supervised learning using this method, along with strategies for handling large data volumes.
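The Spectral Transform step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's Julia implementation: the function names (`dct2`, `idct2`, `keep_largest`, `compress_sequence`) are hypothetical, and reading "most significant components" as "largest-magnitude DCT coefficients" is an assumption.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a real-valued sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def keep_largest(coeffs, m):
    """Zero all but the m largest-magnitude coefficients (assumed selection rule)."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    keep = set(order[:m])
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]


def compress_sequence(vectors, m):
    """Apply the DCT along the sequence axis, one coordinate at a time,
    and keep m coefficients per coordinate."""
    dims = len(vectors[0])
    columns = [[v[d] for v in vectors] for d in range(dims)]
    return [keep_largest(dct2(col), m) for col in columns]
```

Calling `idct2` on each truncated coefficient list gives an approximation of the original coordinate sequence; how much is lost depends on how quickly the coefficients decay for the data at hand.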

artificial intelligence, machine learning, vector, (16 more...)

2310.18424

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment (0.94)
Media > Film (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Vector Embeddings by Sequence Similarity and Context for Improved Compression, Similarity Search, Clustering, Organization, and Manipulation of cDNA Libraries

Um, Daniel H., Knowles, David A., Kaiser, Gail E.

arXiv.org Artificial Intelligence, Aug-8-2023

This paper demonstrates the utility of organized numerical representations of genes in research involving flat string gene formats (i.e., FASTA/FASTQ). FASTA/FASTQ files have several current limitations, such as their large file sizes, slow processing speeds for mapping and alignment, and contextual dependencies. These challenges significantly hinder investigations and tasks that involve finding similar sequences. The solution lies in transforming sequences into an alternative representation that clusters into similar groups more easily than the raw sequences do. By assigning a unique vector embedding to each short sequence, it is possible to cluster the string representations of cDNA libraries more efficiently and to improve compression performance. Furthermore, by learning alternative coordinate vector embeddings based on the contexts of codon triplets, we can demonstrate clustering based on amino acid properties. Finally, using this sequence embedding method to encode barcodes and cDNA sequences, we can improve the time complexity of the similarity search by coupling vector embeddings with an algorithm that determines the proximity of vectors in Euclidean space; this allows us to perform sequence similarity searches in a quicker and more modular fashion.
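The idea of searching for similar sequences by Euclidean proximity of their embeddings can be sketched in Python as follows. The embedding here is a simple normalized k-mer count vector, swapped in for illustration; it is not the learned embedding the paper describes, and the names `kmer_embedding`, `euclidean`, and `nearest` are hypothetical.

```python
import itertools
import math

ALPHABET = "ACGT"


def kmer_embedding(seq, k=2):
    """Map a DNA string to a vector of normalized k-mer counts.
    This is a stand-in embedding, not the paper's learned one."""
    kmers = ["".join(p) for p in itertools.product(ALPHABET, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    vec = [0.0] * len(kmers)
    total = 0
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing symbols outside ACGT
            vec[j] += 1.0
            total += 1
    return [v / total for v in vec] if total else vec


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, library, k=2):
    """Return the library sequence whose embedding is closest to the query's."""
    q = kmer_embedding(query, k)
    return min(library, key=lambda s: euclidean(q, kmer_embedding(s, k)))
```

This brute-force scan compares the query against every library entry, so it only illustrates the distance computation. Any improvement in time complexity would come from precomputing the embeddings and using a spatial index or approximate nearest-neighbor structure, which is not shown here.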

bioinformatics, machine learning, natural language, (15 more...)

2308.05118

Country:

North America > United States > New York > New York County > New York City (0.05)
South America > Uruguay > Maldonado > Maldonado (0.04)
Africa > Cameroon > Gulf of Guinea (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Meet AI's Multitool: Vector Embeddings - Liwaiwai

#artificialintelligence, Apr-6-2022, 12:59:57 GMT

Embeddings are one of the most versatile techniques in machine learning, and a critical tool every ML engineer should have in their toolbelt. It’s a shame, then, that so few of us understand what they are and what they’re good for! The problem, perhaps, is that embeddings sound slightly abstract and esoteric: In machine learning, an embedding ...

embedding, multitool, vector embedding, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.83)
